DP-space: Bayesian Nonparametric Subspace Clustering with Small-variance Asymptotics
Authors
Abstract
Subspace clustering separates data points approximately lying on a union of affine subspaces into several clusters. This paper presents a novel nonparametric Bayesian subspace clustering model that infers both the number of subspaces and the dimension of each subspace from the observed data. Although posterior inference is hard, our model leads to a very efficient deterministic algorithm, DP-space, which retains the nonparametric ability under a small-variance asymptotic analysis. DP-space monotonically minimizes an intuitive objective with an explicit tradeoff between data fitness and model complexity. Experimental results demonstrate that DP-space outperforms various competitors in clustering accuracy while remaining highly efficient.
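The abstract does not spell out the objective, but a small-variance asymptotic analysis of a Dirichlet-process subspace model typically yields a k-means-like cost that sums per-cluster projection residuals and adds penalties for each extra subspace and each extra subspace dimension. The sketch below is an illustrative reconstruction of such an objective, not the paper's exact formulation; the penalty weights lam_cluster and lam_dim and the helper names are assumptions.

```python
import numpy as np

def subspace_residual(X, mean, basis):
    """Squared distance of the rows of X to the affine subspace mean + span(basis)."""
    centered = X - mean
    if basis.shape[1] == 0:                  # zero-dimensional subspace: only the mean
        proj = np.zeros_like(centered)
    else:
        proj = centered @ basis @ basis.T    # orthogonal projection (basis has orthonormal columns)
    return np.sum((centered - proj) ** 2)

def dp_space_like_objective(X, labels, subspaces, lam_cluster=1.0, lam_dim=0.1):
    """Illustrative DP-space-style cost: projection residuals plus penalties on the
    number of subspaces and their total dimension (lam_cluster, lam_dim are hypothetical)."""
    cost, total_dim = 0.0, 0
    for k, (mean, basis) in enumerate(subspaces):
        cost += subspace_residual(X[labels == k], mean, basis)
        total_dim += basis.shape[1]
    return cost + lam_cluster * len(subspaces) + lam_dim * total_dim
```

A tradeoff of this shape makes the "data fitness versus model complexity" claim concrete: lowering the residual by adding subspaces or dimensions is only worthwhile when the gain exceeds the corresponding penalty.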
Similar resources
Small-Variance Asymptotics for Exponential Family Dirichlet Process Mixture Models
Sampling and variational inference techniques are two standard methods for inference in probabilistic models, but for many problems, neither approach scales effectively to large-scale data. An alternative is to relax the probabilistic model into a non-probabilistic formulation which has a scalable associated algorithm. This can often be fulfilled by performing small-variance asymptotics, i.e., ...
MAD-Bayes: MAP-based Asymptotic Derivations from Bayes
The classical mixture of Gaussians model is related to K-means via small-variance asymptotics: as the covariances of the Gaussians tend to zero, the negative log-likelihood of the mixture of Gaussians model approaches the K-means objective, and the EM algorithm approaches the K-means algorithm. Kulis & Jordan (2012) used this observation to obtain a novel K-means-like algorithm from a Gibbs sam...
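For reference, the K-means-like algorithm Kulis & Jordan (2012) obtain from the Gibbs sampler, commonly called DP-means, assigns each point to its nearest center unless the squared distance exceeds a penalty λ, in which case the point seeds a new cluster. The minimal sketch below follows that published recipe; the initialization, stopping rule, and parameter names are my own choices.

```python
import numpy as np

def dp_means(X, lam, n_iters=100):
    """Minimal DP-means: k-means-style updates where a point farther than lam
    (in squared distance) from every center starts a new cluster."""
    centers = [X.mean(axis=0)]                      # start with a single global cluster
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iters):
        changed = False
        for i, x in enumerate(X):
            d2 = [np.sum((x - c) ** 2) for c in centers]
            k = int(np.argmin(d2))
            if d2[k] > lam:                         # too far from all centers: open a new cluster
                centers.append(x.copy())
                k = len(centers) - 1
            if labels[i] != k:
                labels[i], changed = k, True
        centers = [X[labels == k].mean(axis=0)      # recompute means of non-empty clusters
                   for k in range(len(centers)) if np.any(labels == k)]
        remap = {old: new for new, old in enumerate(sorted(set(labels)))}
        labels = np.array([remap[l] for l in labels])   # compact labels to match the center list
        if not changed:
            break
    return labels, np.array(centers)
```

Each pass can only lower the DP-means cost (sum of squared distances plus λ times the number of clusters), which mirrors the monotone-descent behavior claimed for DP-space above.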
Online Inference in Bayesian Non-Parametric Mixture Models under Small Variance Asymptotics
Adapting statistical learning models online with large-scale streaming data is a challenging problem. Bayesian non-parametric mixture models provide flexibility in model selection; however, their widespread use is limited by the computational overhead of existing sampling-based and variational techniques for inference. This paper analyses the online inference problem in Bayesian non-parametric m...
Detailed Derivations of Small-Variance Asymptotics for some Hierarchical Bayesian Nonparametric Models
Numerous flexible Bayesian nonparametric models and associated inference algorithms have been developed in recent years for solving problems such as clustering and time series analysis. However, simpler approaches such as k-means remain extremely popular due to their simplicity and scalability to the large-data setting. The k-means optimization problem can be viewed as the small-variance limit ...
Small-Variance Asymptotics for Bayesian Nonparametric Models with Constraints
Users often have additional knowledge when Bayesian nonparametric (BNP) models are employed, e.g. for clustering there may be prior knowledge that some of the data instances should be in the same cluster (must-link constraint) or in different clusters (cannot-link constraint), and similarly for topic modeling some words should be grouped together or separately because of an underlying seman...
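In the small-variance limit discussed here, must-link and cannot-link constraints are often folded into the assignment cost of a DP-means-style step as violation penalties. The fragment below is a hypothetical illustration of that idea, not the cited paper's algorithm; the penalty weight rho and the dict-of-lists constraint representation are assumptions.

```python
import numpy as np

def penalized_assignment_cost(x, i, centers, labels, must_link, cannot_link, rho=10.0):
    """Cost of assigning point i (with value x) to each existing cluster, adding a
    hypothetical penalty rho for every violated must-link / cannot-link constraint."""
    costs = np.array([np.sum((x - c) ** 2) for c in centers])
    cluster_ids = np.arange(len(centers))
    for j in must_link.get(i, []):        # partners that should share i's cluster
        costs += rho * (cluster_ids != labels[j])
    for j in cannot_link.get(i, []):      # partners that should be in a different cluster
        costs += rho * (cluster_ids == labels[j])
    return costs                          # caller takes the argmin, or opens a new cluster as in DP-means
```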
Publication year: 2015